Abstract: Resource and cost constraints remain a challenge
for wireless sensor network security. In this paper, we propose 
a new approach to protect confidentiality against a parasitic 
adversary, which seeks to exploit sensor networks by obtaining 
measurements in an unauthorized way. Our low-complexity 
solution, GossiCrypt, leverages the large scale of sensor
networks to protect confidentiality efficiently and effectively. 
GossiCrypt protects data by symmetric key encryption at their 
source nodes and re-encryption at a randomly chosen subset 
of nodes en route to the sink. Furthermore, it employs key 
refreshing to mitigate the physical compromise of cryptographic 
keys. We validate GossiCrypt analytically and with simulations, 
showing it protects data confidentiality with probability almost 
one. Moreover, compared with a system that uses public-key data 
encryption, the energy consumption of GossiCrypt is one to three 
orders of magnitude lower. 

I. Introduction 

Wireless sensor networks (WSNs) have been an active field 
of research over the last few years, with a number of technical 
issues largely resolved. As WSNs move toward wider adoption, security
becomes increasingly important and, eventually, security mechanisms
become a prerequisite [23]. Numerous significant efforts have
been made along this line, including public-key cryptography 
(e.g., ITT], [11 1) as the means to digitally sign messages and 
establish symmetric keys, as well as symmetric-key based
encryption and authentication for improved efficiency (e.g.,
ED, ESI). However, sensor data confidentiality has been
largely overlooked to this date. Ensuring that sensor-collected 
data are accessed only by authorized entities has been viewed 
mostly as a secondary concern. 

Encrypting data at their source sensor node, with a symmet- 
ric key shared with the sink, is a straightforward confidentiality 
mechanism. However, it does not fully address the problem 
at hand. An adversary can actively exploit the poor physical 
protection of nodes, as it would be too costly and thus unreal- 
istic to make them tamper-resistant. It is relatively easy for an 
adversary to physically access the node memory contents [14]
and extract the symmetric key used for data encryption. Such 
an attack is vastly simpler than a cryptanalytic one against the 
key. In fact, the adversary could progressively compromise 
keys of numerous nodes, and eventually be able to decrypt a 
significant fraction of, if not all, data produced by the WSN. 

* Jun Luo and Panagiotis Papadimitratos are equally contributing authors. 



We are concerned with sensor data confidentiality in such 
a setting, where cryptographic keys can be physically com- 
promised. We focus on a novel type of adversary we term 
parasitic: it seeks to exploit a WSN, e.g., deployed for 
scientific measurements, industrial (mining, oil) field data, 
or even patients' health data collection, rather than disrupt, 
degrade, or prevent the WSN operation. A parasitic adversary, 
defined in detail in Sec. III, aims at obtaining measurements
with the least expenditure of own resources, and the least 
disruption of the WSN it "attaches" itself to. Essentially, the 
longer the symbiotic relation of the adversary with a fully 
functioning WSN remains unnoticed, the more successful the 
parasitic adversary will be. 

One naive solution against (symmetric) key compromise 
is to let sensors encrypt each outgoing measurement with 
the public key of the sink. As long as the sink is not
compromised, it is the only one able to decrypt those messages,
and the parasitic adversary is thwarted. However, software
implementations of public-key operations, albeit computationally
feasible, consume approximately three orders of magnitude
more energy than symmetric key encryption [25].
Hardware implementations of public key encryption (PKE) 
can significantly reduce energy consumption, but they remain 
accordingly costlier than symmetric key encryption (SKE) 
hardware implementations (Sec. V-B).

Therefore, we are facing the challenge of protecting data 
confidentiality against parasitic adversaries in an energy-efficient
manner. To this end, we propose here GossiCrypt, whose
mechanisms are tailored to and leverage the salient features
of WSNs. GossiCrypt comprises two building blocks: (i) a
probabilistic en route re-encryption scheme, with the source 
node always encrypting the data and with relaying nodes en 
route to the sink flipping a coin to "decide" whether to perform 
re-encryption, and (ii) a key refreshing mechanism that installs 
new sensor-sink shared symmetric keys to selected nodes. 

Key refreshing is the immediate response to the compromise 
of a cryptographic key, but it can mitigate such an attack only 
to a certain extent: it is hard for the WSN operator to infer 
which keys were compromised. Also, running a network-wide 
key distribution protocol frequently can be very costly in an 
energy-constrained environment. More importantly, between two
refreshing events, the adversary would still be fully able
to decrypt data from nodes whose keys were compromised.
This is where the en route re-encryption complements our 
(infrequent) key refreshing: data (or keys) can be decrypted 
by the adversary only if all the keys used for source and en- 
route encryption are compromised. 

GossiCrypt has extremely simple key management require- 
ments and very low complexity operation. Each sensor shares 
one data encryption symmetric key with the network sink. In 
addition, a single parameter drives probabilistically the partic- 
ipation of each node in en-route encryptions. This simplicity 
is inherent in gossiping protocols, with nodes flipping a coin 
to determine, e.g., if they should synchronize their databases 
or relay a message ID, IIT2I . This inspires the name of our 
scheme, as the decision is on (re-)encrypting rather than on 
relaying a packet. Key refreshing is also simple, as it is 
performed with randomly chosen nodes. Overall, simplicity 
renders GossiCrypt broadly applicable. 

Our main contribution is an efficient and highly effective, as 
our evaluation shows, scheme to ensure sensor data confidentiality.
The objectives of GossiCrypt are specified in Sec. IV.
We validate the effectiveness of our scheme analytically and 
experimentally. Attacked by a parasitic adversary that con- 
tinuously compromises new nodes to obtain their encryption 
keys, GossiCrypt protects the confidentiality of data with 
probability almost one. At the same time, the comparison 
with PKE shows that the GossiCrypt energy expenditure is
significantly lower. Another contribution is the introduction of
the parasitic adversary, a realistic type of attacker for a wide 
range of commodity and tactical WSNs. To the best of our 
knowledge, this is a novel yet realistic and highly effective, 
unless thwarted, type of adversary. 

In the rest of the paper, we first provide the system and 
adversary models. Then, we present an overview of our 
scheme and present in detail its constituent protocols. In Sec. V
and VI, we analyze our scheme and provide an experimental
validation. Due to space limitations, the literature survey is omitted;
a detailed discussion of related work can be found in [18]. We
discuss a number of issues related to our scheme in Sec. VII
and conclude in Sec. VIII.

II. System Model 

The WSN comprises N sensor nodes, each with a unique
identity S_i, and a network sink Θ performing data collection
and key refreshing. It is straightforward to consider multiple
sinks, even with distinct roles, yet we omit this for simplicity
of presentation. Each node S_i shares a symmetric key, K_{i,Θ},
with the sink, and knows the public key, PuK_Θ, of the sink.
The sink is equipped with all K_{i,Θ}.

Beyond these end-to-end, sensor-to-sink, associations, nodes 
may share symmetric keys with their neighbors, to enable link- 
layer security primitives (e.g., TinySec [16]). However, such 
security mechanisms are beyond the scope of this work and 
they can clearly coexist with our scheme. 

We describe the data of interest with the help of two
parameters, T and δ; the user seeks to collect data:

• From a fraction 0 < δ ≤ 1 of the WSN nodes.

• Over a period of T seconds, for each selected node S_j.

The actual values of T and δ can vary. T can range from
a short period, t_0, for a single sensor measurement, to a
sufficiently long period for a comprehensive measurement
collection. In general, T = k t_0, with k > 0 an integer.
Similarly, δ = 1/N, i.e., targeting a certain node, may be
meaningful, but in practice δ will be a significant fraction of
N.¹ We do not dwell on the exact measurement extraction
method, which can be performed in many ways orthogonal to
our scheme.

We assume that N ranges from hundreds to thousands, 
as, for example, in WSNs for commercial inventory, habitat 
monitoring, industrial and mining field data, and geological 
measurements. Experience from prior deployments, with node 
placement sparser than the monitored physical system and 
relatively long history of measurements necessary to capture 
the studied phenomena, teaches that data sensed by each and 
every node is significant. This implies that in-network data 
aggregation is not an option in such deployments; we assume 
this is the case in this work. We also assume WSNs enabling 
applications that do not undergo development. Thus, the entire 
operating system (apart from certain tunable parameters) is 
stored in read-only memory (ROM). Finally, due to cost
considerations, WSN nodes are not tamper-resistant, nor do they
store cryptographic keys in tamper-resistant components.

III. Adversary Model 

We identify a new type of adversary we term parasitic. Its 
objective is to exploit deployed wireless sensor networks, by 
accessing in an unauthorized manner data collected by those 
WSNs. More specifically, a parasitic adversary: 

1. Seeks to obtain the WSN data collected according to the
parameters δ and T.

2. Can be physically present, at each point in time, only
at a much smaller fraction of the area covered by ⌈δN⌉
sensor nodes.

3. Can physically access data stored at sensor nodes and
retrieve their cryptographic keys.

4. Can be mobile [20], i.e., compromise different sets of
nodes over different time intervals. "Mobile" traditionally
refers to virtual moves (in terms of compromising
system entities); here, it also represents physical moves
of the adversary.

5. Can compromise in the above-described manner at most
one sensor per τ seconds. We assume τ ≪ T.

The characteristics of the parasitic adversary reflect its real- 
ism. Constrained presence (assumption 2) is meaningful, be- 
cause, otherwise, the adversary could deploy its own WSN and 
trivially obtain the data the WSN user collects (assumption 1). 
It exploits obvious weaknesses of WSNs (assumption 3): poor 
physical protection makes it relatively easy to obtain data 

¹ WSNs deployed for (often one-time) event detection (e.g., forest fire or
bridge structural faults) would correspond to δ = 1, and T equal to the period
from the WSN deployment to the event/alarm occurrence.



Fig. 1. Mobility of the parasitic adversary. (Figure labels: "(Virtual) mobility
of the adversary" and "Time lines of data histories".)



encryption keys [14]. The parasitic adversary is unobtrusive,
that is, it cannot modify the implemented protocols stored in
ROM (Sec. II). Furthermore, it can utilize its resources intelligently.
Mobility (assumption 4), illustrated in Fig. 1, shows
that the adversary can be in the proximity of different nodes 
for periods of time during which it either compromises the 
node, or obtains snapshots of their measurement histories, or 
intercepts messages sent from nodes within its receiving range. 

The strength of the adversary is evident from assumption 5: 
the time needed to physically compromise a single node, albeit 
significant if nodes are carefully designed, is much shorter than 
T, the period over which data are to be collected. In other 
words, the benefit of the adversary from compromising sensor 
nodes is far reaching. The adversary could remain within 
range of the compromised node and trivially intercept all its 
transmissions. But such an attack would be self-defeating: 
from assumption 2, the adversary would certainly capture
far fewer than ⌈δN⌉ measurements. From a different point
of view, assumption 2 captures the difficulty to deploy a 
network of eavesdroppers within one hop of all previously 
compromised nodes. The eavesdroppers' transceivers would 
need to be highly sensitive (and thus more expensive than 
that of a sensor node) to cover a meaningful fraction of the 
targeted WSN. Overall, leaving "sentry" nodes behind would 
be comparable to deploying a WSN by the adversary. 

We assume that the protocol design and implementation are 
such that remote node compromise is prevented. For example, 
the adversary cannot exploit arbitrary software weaknesses 
and make a sensor node disclose its cryptographic keys. 
Such robustness should be possible given the relatively simple 
functionality of WSN node software, compared to that of 
more complex systems (e.g., desktop or portable computers). 
We also assume that the sink cannot be compromised by 
the adversary. Readers are referred to [29] for investigations
of the compromise of low-end mobile sinks. Moreover,
denial-of-service (DoS) attacks, including jamming in various
protocol layers [28], Sybil/node replication attacks [22], or
"wormhole" formation [21], are beyond the scope of this
work: countermeasures to those attacks can coexist with our 
protocols. Neither do we consider physical destruction of WSN 
nodes, which would not benefit the adversary. 

IV. GossiCrypt 

Fig. 2. Securing data collection with GossiCrypt and query authentication
(μTESLA [32] for example).

GossiCrypt aims at ensuring confidentiality, that is, preventing
any unauthorized access to data collected by a WSN. It
does not seek to protect data coming from every single sensor,
but rather intends to fulfill the following property, for some
protocol-specific constant 0 < Λ < 1:

Λ_T-Confidentiality: Data collected from a WSN comprising
N nodes are Λ_T-confidential if the adversary cannot obtain
all measurements performed by more than ⌈NΛ⌉ sensor nodes
over a given time interval T.

This is a safety property, i.e., a property related to a system- 
specific unwanted situation: obtaining measurements from a 
given fraction of sensor nodes over a period of time, meaning- 
ful with respect to the system and application, is prevented. In 
Sec. V-A, we will show that GossiCrypt satisfies this property
against parasitic adversaries with probability almost one. 

We emphasize that GossiCrypt does not seek to provide 
sensor data authenticity and integrity. The reason is that if a 
key is compromised, an adversary (not necessarily a parasitic 
one) can impersonate the corresponding sensor and inject 
fabricated messages. Nonetheless, data that originate from 
non-compromised nodes have their authenticity and integrity 
protected. We also clarify that GossiCrypt does not seek to 
hide the identities of sensor nodes, achieve data source un- 
traceability, or satisfy any notion of anonymity, unlinkability, 
or privacy. Clearly, confidentiality relates to privacy, but, again, 
all GossiCrypt seeks to provide is the confidentiality of the data 
provided by sensor nodes. 

A. Data Encryption 

We distinguish sensor nodes into two types, data sources 
and relaying nodes, with each node assuming either role at 
different points in time. We denote by GossiCrypt^ the data 
encryption operation of GossiCrypt. As illustrated in Fig. 2,
it is executed by nodes on the path from a data source to a 
sink (inclusive), with the outcome (i.e., re-encrypting or not) 
at each relaying node being random (with probability q). 

The path may be one hop, if the sink is within the transmis- 
sion range of the sensor node, but this is not cost-effective; in 
general, the sink is at a distance of multiple hops from data 
source(s). The path discovery is orthogonal to GossiCrypt ^. It 
can be determined by a (secure) routing protocol, for example, 
forming an authenticated tree rooted at the sink fl^, possibly 
on-the-fly, as a result of the query sent out from a sink. 
GossiCrypt ^ can be employed on top of any path discovery 
protocol and does not impose extra requirements. For the rest 
of the discussion, we assume that, minimally, each S_i knows
the next node towards Θ on path_{S_i,Θ}, without the transmitted
packet carrying the routing information.



For a sensor measurement m, a symmetric key K_{i,Θ} shared
by Θ and S_i, a message authentication code MAC(K_{i,Θ}, ...),
and q ∈ (0, 1) the protocol-specific parameter governing the
en route re-encryption, the data encryption operation of GossiCrypt,
with arguments (K_{i,Θ}, path_{S_i,Θ}, q), is invoked by S_i acting
as a source:

1. Source node, S_i:

1.a. Generate a nonce n for the communication with sink Θ.

1.b. Calculate H = MAC(K_{i,Θ}, m, n, S_i).

1.c. Encrypt m, n, H with K_{i,Θ} to obtain ciphertext
σ_i = {m, n, H}_{K_{i,Θ}}.

1.d. Transmit packet p_i = σ_i, S_i to the first relaying
node S_j on path_{S_i,Θ}.

2. Relaying node, S_j:

2.a. Upon receipt of a packet p_i, generate a random
number x ∈ [0, 1]. If x > q, relay p_i to the next
relaying node S_k on path_{S_j,Θ}, or to Θ. Otherwise,

2.b. Generate ciphertext σ_j = {p_i}_{K_{j,Θ}}.

2.c. Append own identity S_j to σ_j.

2.d. Relay packet p_j = σ_j, S_j to the next relaying node
S_k along path_{S_j,Θ}, or to Θ.

3. Sink Θ:

3.a. Upon receipt of a packet p_k, retrieve K_{k,Θ}, the key
shared with S_k, and decrypt σ_k. If the source, S_i,
cleartext m, n, H is obtained, go to (c). Otherwise,

3.b. Obtain ciphertext σ_j and S_j. Decrypt σ_j with K_{j,Θ}.
Repeat successively for all S_j that re-encrypted the
packet, till obtaining the source cleartext m, n, H.

3.c. Determine if n was previously seen. If so, discard
the packet. Otherwise,

3.d. Compute H' = MAC(K_{i,Θ}, m, n, S_i). Discard the
packet if H' ≠ H. Otherwise, deliver m to the
WSN user.
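
The three roles above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the SHA-256 counter-mode keystream stands in for whatever symmetric cipher a real node would use, HMAC-SHA256 stands in for the MAC, and the one-byte identities, 8-byte nonces, and the `TAG_SOURCE`/`TAG_RELAY` markers (which let the sink tell a source cleartext from an inner packet) are hypothetical choices that the protocol description does not specify.

```python
import hashlib
import hmac
import os
import random

TAG_SOURCE, TAG_RELAY = b"S", b"R"  # hypothetical layer markers, not part of the paper


def xor_stream(key: bytes, iv: bytes, data: bytes) -> bytes:
    """Toy SHA-256 counter-mode stream cipher; a stand-in for a real symmetric cipher."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + iv + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


def seal(key: bytes, node_id: int, plaintext: bytes) -> bytes:
    """Build a packet p = (sigma, S): one identity byte, a fresh IV, then the ciphertext."""
    iv = os.urandom(8)
    return bytes([node_id]) + iv + xor_stream(key, iv, plaintext)


def mac(key: bytes, m: bytes, n: bytes, node_id: int) -> bytes:
    """H = MAC(K, m, n, S), here instantiated with HMAC-SHA256 (an assumption)."""
    return hmac.new(key, m + n + bytes([node_id]), hashlib.sha256).digest()


def source_send(keys: dict, i: int, m: bytes) -> bytes:
    """Steps 1.a-1.d: nonce, MAC, encryption under K_i, packet carrying identity S_i."""
    n = os.urandom(8)
    return seal(keys[i], i, TAG_SOURCE + n + mac(keys[i], m, n, i) + m)


def relay(keys: dict, j: int, packet: bytes, q: float, rng: random.Random) -> bytes:
    """Steps 2.a-2.d: forward unchanged if x > q, otherwise re-encrypt under K_j."""
    if rng.random() > q:
        return packet
    return seal(keys[j], j, TAG_RELAY + packet)


def sink_receive(keys: dict, packet: bytes, seen_nonces: set):
    """Steps 3.a-3.d: peel layers, reject replays, verify the MAC; return m or None."""
    while True:
        if len(packet) < 9 or packet[0] not in keys:
            return None
        node_id = packet[0]
        plain = xor_stream(keys[node_id], packet[1:9], packet[9:])
        if plain[:1] != TAG_RELAY:
            break
        packet = plain[1:]  # an inner packet: keep peeling
    if plain[:1] != TAG_SOURCE or len(plain) < 41:
        return None
    n, h, m = plain[1:9], plain[9:41], plain[41:]
    if n in seen_nonces or not hmac.compare_digest(h, mac(keys[node_id], m, n, node_id)):
        return None
    seen_nonces.add(n)
    return m
```

With `q = 1.0` every relay adds a layer and with `q` close to 0 almost none does; in both cases the sink recovers `m` because each layer names the node whose key must be used next.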

B. Key Refreshing 

To defend against the progressive compromise of an increasing
number of nodes, K_{i,Θ} keys should be refreshed, i.e.,
replaced with new K'_{i,Θ} keys. The sink is typically unaware
of which nodes are already compromised. Thus, it randomly
selects a node S_i to refresh, among a set of N' < N nodes.
This selection is, in general, made among the data source
nodes of interest (the δ fraction of N as defined in Sec. II),
and all the intermediate nodes that connect those sources to 
the sink. In other words, the refreshing effort focuses on the 
same part of the network that is meaningful for the adversary 
to target. 

Given a particular system design for the nodes, it is not
very difficult to obtain an arguably pessimistic estimate of
the rate of physical node compromise, as per Sec. III. Then,
based on this estimate of τ^{-1}, the key refreshing rate λ_r can
be selected accordingly by the sink, and conveyed to all nodes
via an authenticated control message. Confidentiality of λ_r is
not needed, as the adversary would, at best, compromise nodes
at its maximum possible rate τ^{-1}. Authenticity, however, is
clearly required, to ensure that an active adversary does not
"slow down" the key refreshing.

Symmetric-key based key transport techniques, similar to
those in [1], are effective only if the adversary, having previously
compromised K_{i,Θ}, cannot intercept the key refreshing
protocol messages. Moreover, an interactive key establishment 
protocol, for example, initiated by the sink, would reveal 
the identity of the node whose key is being refreshed. The 
adversary could eavesdrop on all messages sent to and received from
the sink, and hence gain a significant advantage: that is, know
which nodes were refreshed and then re-compromise them.

To thwart these two vulnerabilities, we propose a key 
refreshing protocol with two variants. This is essentially a key 
transport protocol, but it leverages (i) the GossiCrypt data
encryption operation, with optional public key encryption at the source
sensor node, and (ii) the integration of the key refreshing with
the data collection. As a result, the key refreshing protocol is
similar to the data encryption protocol, presented in Sec. IV-A.
There are two main differences: a random point process generator
[6], RGen(λ_r), used to generate (key refreshing) events
with intensity λ_r, and a flag set to indicate to the sink that a
new key K'_{i,Θ} is included in the message (which, otherwise,
externally appears identical to any measurement/data reporting
message). The protocol operates as follows:

1. Source node, S_i:

1.a. Upon an event of RGen(λ_r), generate a new key
K'_{i,Θ}; wait for the time till the next data report.

1.b. Upon a data report to be returned, delay the report
to be combined with the next one, and generate a
nonce n for the communication with sink Θ.

1.c. Calculate H = MAC(K_{i,Θ}, flag, K'_{i,Θ}, n, S_i).

1.d. Encrypt flag, K'_{i,Θ}, n, H with K_{i,Θ}, to obtain
ciphertext σ_i = {flag, K'_{i,Θ}, n, H}_{K_{i,Θ}}.

1.e. Transmit packet p_i = σ_i, S_i to the first relaying
node S_j on path_{S_i,Θ}.

2. Relaying node, S_j:

Identical to the data encryption operation of GossiCrypt (Sec. IV-A).

3. Sink Θ:

3.a. Perform the steps (3).(a)-(b) as specified in
Sec. IV-A to obtain the source, S_i, cleartext
flag, K'_{i,Θ}, n, H.

3.b. Determine if n was previously seen. If so, discard
the packet. Otherwise,

3.c. Calculate H' = MAC(K_{i,Θ}, flag, K'_{i,Θ}, n, S_i). If
H' ≠ H, discard the packet. Otherwise, replace
K_{i,Θ} with K'_{i,Θ}.

The protocol installs a new key even if the adversary intercepts
the message en route to the sink, unless the adversary is
physically within one hop from the previously compromised
and now to-be-refreshed S_i. In the latter case (which is rare
due to the constrained physical presence of an adversary), the
adversary can decrypt the message and obtain the key. To 
prevent this, we propose the following variant of the above 
key refreshing protocol: 

1. Source node, S_i: Identical to the above key refreshing
operation, with the additional step between (b) and (c),
and replacing K'_{i,Θ} with σ_{K_i} afterwards:

1.b+. Encrypt K'_{i,Θ} with PuK_Θ, the public key of
the sink, and obtain the ciphertext
σ_{K_i} = {S_i, K'_{i,Θ}}_{PuK_Θ}.

2. Relaying node, S_j:

Identical to the data encryption operation of GossiCrypt (Sec. IV-A).

3. Sink Θ: Identical to the above key refreshing operation,
with the additional step:

3.d. Decrypt σ_{K_i} with PrK_Θ, the private key of the
sink, and check if the obtained node identity is S_i.
If so, replace K_{i,Θ} with K'_{i,Θ}.

This second variant's use of PKE resembles mechanism 1 of
the ISO/IEC 11770-3 standard [2]. It ensures that even in the
unlikely event that the adversary is within one hop of the refreshed
node, it still cannot obtain the new K'_{i,Θ}. The only option for
the adversary would be to re-compromise S_i.
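
The event generator RGen used in step 1.a can be illustrated as follows. The sketch assumes that RGen is a homogeneous Poisson process, sampled through exponential inter-arrival times, and that data reports are periodic with a fixed `report_period`; both the function names and the periodic-report assumption are hypothetical and only serve to show how a refresh event is deferred to a regular reporting slot so that the key message looks like ordinary traffic.

```python
import math
import random


def rgen(rate: float, horizon: float, rng: random.Random) -> list:
    """Event times of a homogeneous Poisson process with intensity `rate` on [0, horizon)."""
    times = []
    t = rng.expovariate(rate)  # exponential inter-arrival time with mean 1/rate
    while t < horizon:
        times.append(t)
        t += rng.expovariate(rate)
    return times


def refresh_slots(event_times: list, report_period: float) -> list:
    """Map each refresh event to the first periodic report slot at or after it.

    The new key is generated at the event time but only transmitted in that slot,
    taking the place of the measurement report (which is delayed to the next slot).
    """
    return [math.ceil(t / report_period) * report_period for t in event_times]
```

For example, `refresh_slots(rgen(0.01, 3600.0, random.Random(7)), 60.0)` yields the reporting instants at which a node with refresh rate 0.01 events per second would piggyback a new key during one hour.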

V. Protocol Analysis 

We analyze the security level of GossiCrypt and also 
compare its energy expenditure with a possible alternative in 
this section. Our security analysis focuses only on the parasitic 
adversary; further discussion on other adversaries is given
in Sec. VII and [18]. The security analysis applies to both
data encryption and key refreshing (with or without PKE) 
protocols, as they follow the same principle. 

A. Security Analysis 

In this section, we describe a model of GossiCrypt and 
evaluate it against the Λ_T-Confidentiality property (Sec. IV)
and the parasitic adversary (Sec. III). Our analysis, accompanied
by simulation results in Sec. VI, shows that even with a
significant fraction of sensor nodes compromised, GossiCrypt 
safeguards confidentiality with probability almost one. 

Fundamental for the analysis is the fraction of correct, i.e., 
not compromised, nodes; this is determined by the behaviors 
of the sink refreshing and the adversary compromising keys. 
Therefore, we model the state of the system, the number of 
correct nodes, as a stochastic process. Our security analysis on 
GossiCrypt is based on the stationary regime of this process. 

Since the sink cannot in general know which keys are 
already compromised, a randomized strategy on selecting 
which node to refresh is a reasonable choice. We assume that 
the sink does so with an effective² refresh rate λ. Recall that
the sink governs the selection procedure through setting the
parameter λ_r. The adversary, compromising nodes at rate τ^{-1},
is also modeled as selecting the next node to compromise (or
to test if the key was refreshed³) arbitrarily. This is so, because

² The model covers the two options (with or without PKE) of the key
refreshing protocol described in Sec. IV-B. Although the key refreshing without
PKE might allow the adversary to obtain the new key, it is still highly likely
that new keys are not exposed to the adversary, as the adversary cannot be
ubiquitously present (also pointed out in [3]). Thus, the model still applies
but with the refreshing rate λ_r discounted by a factor.

³ A model that assumes the rate of testing differs from that of compromising
does not fundamentally change the stationary distribution.



the adversary is also in general unaware of which keys were
refreshed by the sink.⁴ Although an adversary physically close
to a source node S_i may detect a key refreshing, its physical
presence is limited to a negligible fraction of the network.
Note that re-encryption deprives the adversary of this ability
elsewhere. The aforementioned assumptions suggest that both
the sink and the adversary follow Markov chains ID in 
choosing the next target. In particular, the adversary may 
follow a deterministic trajectory, which is a special Markov 
chain with deterministic transitions. 

The system size depends on the behavior of the sink. If the 
sink is static and the data collection paths change slowly, if 
at all, over time, both the sink and the adversary could have 
a clear view on which nodes they need to target: the source 
sensor nodes of interest and the relaying nodes en-route to the 
sink. Or better even, from the adversary's point of view, the 
slightly smaller subset of sources and relaying nodes en-route 
to the point it intercepts the measurement packets. As a result, 
the system is this known subset of nodes with size N' < N. 
On the other hand, if a mobile sink is used ifTSl . 12611 . flj], the 
adversary cannot predict the data collection paths. This results 
in a larger system size, which essentially can be all nodes, 
offering higher robustness against the adversary at the expense 
of complexity in operating the mobile sink. We emphasize 
however that our analysis is applicable to both cases. All one 
needs to do is to view N below as the effective system size. 

We assume that the times of performing refreshing and
compromising can be modeled as two independent Poisson
processes with intensities λ and τ^{-1}, respectively. We also
assume that, at each time point in the processes, either the sink
approaches a node and refreshes it or the adversary captures
a node and compromises it, no matter whether the node has
been compromised or not. The Poissonian and independence
assumptions are not essential. The easily drawn analogies
between our model and the teletraffic models [5] imply that the
stationary distribution is insensitive to all other characteristics
beyond the intensities.
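
A small Monte Carlo sketch can make these assumptions concrete. It simulates only the sequence of events: each event is a refresh with probability λ/(λ + τ^{-1}) and a compromise otherwise, and it hits a uniformly chosen node regardless of that node's current state. The function name and parameters are illustrative. The long-run fraction of correct nodes it should approach, λτ/(1 + λτ), is not stated in the text; it follows from the observation that, under these assumptions, each node is independently refreshed at rate λ/N and compromised at rate τ^{-1}/N.

```python
import random


def simulate_correct_fraction(n_nodes: int, lam: float, tau: float,
                              n_events: int, seed: int = 0) -> float:
    """Time-averaged fraction of correct nodes over a sequence of refresh/compromise events."""
    rng = random.Random(seed)
    p_refresh = lam / (lam + 1.0 / tau)   # probability that the next event is a refresh
    correct = [True] * n_nodes            # start with no compromised node
    n_correct = n_nodes
    total = 0
    for _ in range(n_events):
        node = rng.randrange(n_nodes)     # both parties pick their target uniformly
        if rng.random() < p_refresh:
            if not correct[node]:         # refreshing a compromised node restores it
                correct[node] = True
                n_correct += 1
        else:
            if correct[node]:             # compromising a correct node
                correct[node] = False
                n_correct -= 1
        total += n_correct                # fruitless events leave the state unchanged
    return total / (n_events * n_nodes)
```

With `n_nodes=100`, `lam=1.0`, and `tau=1.5`, the returned average should settle near 0.6, i.e., roughly 40% of the nodes compromised in the long run.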

Based on these assumptions, we describe the system states

⁴ In a static sink network, the adversary might gradually, over a long
period of eavesdropping, infer (part of) the communication paths connecting
the sensor nodes to the sink. This could allow the adversary to launch
a deterministic attack (e.g., starting from the sink's neighbors and then
moving outwards, compromising their upstream nodes). This might allow the
adversary to fight back against symmetric-key based refreshing if and only if
it has compromised the entire path connecting the refreshed node to the sink.
However, this attack would be completely ineffective against a public key
based refreshing (as described in Sec. IV-B). The only approach that could
allow the adversary to detect if some node S_i re-encrypted a message with a
new key (that does not allow the adversary to decrypt the message and then
can guide its re-compromise), would be to intercept the message before it is
received and after it is relayed by S_i. But this would imply physical presence
of the adversary along the entire path and eventually the source node(s).
This would contradict assumption 2. Therefore, the deterministic, targeted
compromise pattern would be essentially impossible, and thus no more
effective than a random one. We note that it is also possible that
the sink counters deterministic attack patterns with similarly structured refresh
patterns. However, investigation of those, albeit interesting, is not provided here
due to space limitations. For example, the efficiency of the scheme could
be greatly enhanced if the public key refreshing protocol is run with nodes near
the sink, to "break" chains of fully compromised paths and make symmetric-key
refreshing effective even against this deterministic attack.



as a continuous-time Markov chain {X(t)}_{t≥0} driven by the
Poisson processes. Since such a chain is characterized by its
subordinated chain {X_n}_{n≥0} [6], we focus on this discrete
Markov chain. A direct observation on the system is that
the more numerous the compromised nodes, the lower the
efficiency of the adversary (and thus the higher the efficiency
of the sink), and vice versa. The reason is clear: when
many nodes are compromised, the probability of fruitlessly
re-compromising becomes high. This reminds us of the celebrated
model described by Paul and Tatiana Ehrenfest (sometimes
referred to as the Urn of Ehrenfest) [9] for understanding
the diffusion through a porous membrane.⁵ The system we
consider differs from the Urn of Ehrenfest in that the "self"
transition probability is non-zero (i.e., p_{ii} > 0) and also that
the transition probability depends on the rates λ and τ^{-1}.
Therefore, the transition matrix of the subordinated chain
{X_n}_{n≥0} is as follows:

\[
\begin{pmatrix}
s_0 & \nu_0 & & & \\
\mu_1 & s_1 & \nu_1 & & \\
 & \ddots & \ddots & \ddots & \\
 & & \mu_{N-1} & s_{N-1} & \nu_{N-1} \\
 & & & \mu_N & s_N
\end{pmatrix}
\]

where i is the number of correct nodes in the system,
$\mu_i = \frac{i}{N\tau(\lambda+\tau^{-1})}$ and
$\nu_i = \frac{(N-i)\lambda}{N(\lambda+\tau^{-1})}$ represent the transitions
resulting from a compromising and a refreshing, respectively, and
$s_i = \frac{N-i}{N\tau(\lambda+\tau^{-1})} + \frac{i\lambda}{N(\lambda+\tau^{-1})}$
expresses those fruitless operations. One can easily see that this is a
birth-and-death process in continuous time with reflecting barriers at 0 and
N [6]. The chain {X_n}_{n≥0} is irreducible (i.e., every state is
reachable from all other states) and positive recurrent (i.e., the
system does not freeze at some states). It has the following
stationary distribution (the detailed computation is omitted):

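\[
\pi_0 = \left( 1 + \frac{\nu_0}{\mu_1} + \frac{\nu_0 \nu_1}{\mu_1 \mu_2} + \cdots
+ \frac{\nu_0 \nu_1 \cdots \nu_{N-1}}{\mu_1 \mu_2 \cdots \mu_N} \right)^{-1}
\tag{1}
\]

\[
\pi_i = \pi_0 \, \frac{\nu_0 \nu_1 \cdots \nu_{i-1}}{\mu_1 \mu_2 \cdots \mu_i}
\tag{2}
\]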

Note that this is also the stationary distribution of {X(t)}_{t≥0}.
It has the following properties:

• The system can rarely be free either of correct nodes
(X(t) = 0) or of compromised nodes (X(t) = N),
because both π_0 and π_N vanish with increasing N.

• The most likely state (i.e., argmax_i π_i) lies between 0
and N; it depends on the magnitude of λ and τ^{-1}. The
larger the value of λτ (the ratio between the rate of
refreshing and that of compromising) is, the closer this
state is to N.

These two properties can be easily observed in Fig. 3. It

⁵ The model can be briefly described as follows [6]: there are N particles
that can be in either compartment A or B. Suppose at time t, there are i
particles in A. The diffusion process behaves as if someone chooses a particle
at random and moves it to the other compartment at time t + 1. Therefore,
the transition probability is p_{ij} = i/N (j = i - 1), or (N - i)/N (j = i + 1),
or 0 (otherwise).
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Fig. 3. Stationary distribution vr with A'^ = 100, A = 1, and r = 0.6, 1, 1.5. 
The y-axis is the probability density corresponding to a certain number of 
correct nodes. Since only the product At matters, we choose the values of A 
and r arbitrarily without a dimension. 



shows that even if the sink is more efficient than the adversary 
(At — 1.5, the red curve), there are still approximately 40% 
compromised nodes. 

Now, we evaluate the probability of having at least one
correct node re-encrypting the data on a routing path of length
$L$ from a source to the adversary. Let a random variable $Y$ be
the number of correct nodes re-encrypting the data and hence

$$
Y = \sum_{n=1}^{M} \Omega_n, \quad M \le L \tag{3}
$$

where $M$ is the random variable representing the number of
nodes that re-encrypt the data and $\{\Omega_n\}$ are i.i.d. Bernoulli
variables indicating the state of each of the $M$ nodes ($\Omega_n =
1$ if correct and 0 otherwise). We want to calculate $P\{Y >
0\} = 1 - P\{Y = 0\}$, the success probability (in the sense that
GossiCrypt successfully provides confidentiality). To this end,
we make use of the generating function $g_Y(z)$ of $Y$, because
$P\{Y = 0\} = g_Y(0)$ and, by the rule of random sum of i.i.d.
variables [6], $g_Y(z) = g_M(g_\Omega(z))$. Therefore,

$$
P\{Y = 0\} = g_M(g_\Omega(0)) = E_M\left[ P\{\Omega_0 = 0\}^M \right] = \sum_{m=0}^{L} P\{M = m\} \, P\{\Omega_0 = 0\}^m \tag{4}
$$



Given the stationary distribution $\pi$ of $\{X(t)\}_{t \ge 0}$,

$$
P\{\Omega_0 = 0\} = \sum_{i=0}^{N} P\{\Omega, X(t) = i\} = \sum_{i=0}^{N} \pi_i \, \frac{N - i}{N} \tag{5}
$$



where $\Omega$ is the event of picking a node within $N - i$ compromised
ones. We illustrate the success probability $P\{Y > 0\}$
under different values of $L$ and $q$ in Table I, assuming
$N = 100$, $\lambda = 1$, and $\tau = 1.5$. One might think the case
where $P\{Y > 0\} = 0.8258$ (for $L = 5$ and $q = 0.5$)
is an unfavorable bet for the legitimate user (because the
adversary is able to decrypt the data with probability 0.1742);
the adversary, however, gains nothing from this. To understand
this point, we refer again to Fig. 1. Since what the adversary
might decrypt (with probability 0.1742) is just a snapshot, the
probability of observing the whole data history goes to zero
(the probability of obtaining three snapshots is already very
low: $0.1742^3 = 0.0053$). Note that we take for granted that the
events of decrypting two different snapshots are independent;
this is guaranteed by the coin flipping procedure even if two
snapshots are transmitted through the same routing path.





L \ q     0.5       0.6       0.7       0.8       0.9
5         0.8258    0.8875    0.9303    0.9590    0.9773
6         0.8772    0.9273    0.9591    0.9783    0.9894
7         0.9134    0.9531    0.9760    0.9886    0.9950
8         0.9390    0.9697    0.9859    0.9940    0.9977
9         0.9570    0.9804    0.9917    0.9968    0.9989
10        0.9697    0.9873    0.9951    0.9983    0.9995
11        0.9786    0.9918    0.9871    0.9991    0.9998
12        0.9849    0.9947    0.9983    0.9995    0.9999

TABLE I

Success probability $P\{Y > 0\}$ under different values of $L$
(path length) and $q$ (coin flip probability).
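Eq. (4) can be evaluated directly once a distribution for $M$ is fixed. The sketch below assumes $M \sim$ Binomial($L$, $q$), i.e., each of the $L$ nodes on the path flips an independent coin with probability $q$, and treats the per-node compromise probability $P\{\Omega_0 = 0\}$ as an input `p_comp`. The function names and the value `p_comp = 0.4` (the approximate compromised fraction quoted for $\lambda\tau = 1.5$) are assumptions for illustration, so the printed values are close to, but not a reproduction of, Table I.

```python
from math import comb


def success_probability(path_len, q, p_comp):
    """P{Y > 0} = 1 - sum_m P{M = m} * p_comp**m, with M ~ Binomial(L, q) (assumed)."""
    p_fail = sum(
        comb(path_len, m) * q**m * (1.0 - q) ** (path_len - m) * p_comp**m
        for m in range(path_len + 1)
    )
    return 1.0 - p_fail


def success_probability_closed_form(path_len, q, p_comp):
    """Same quantity via the binomial generating function g_M(z) = (1 - q + q z)^L."""
    return 1.0 - (1.0 - q + q * p_comp) ** path_len


if __name__ == "__main__":
    for path_len in (5, 8, 12):
        row = [success_probability(path_len, q, p_comp=0.4) for q in (0.5, 0.7, 0.9)]
        print(path_len, " ".join(f"{v:.4f}" for v in row))
```

Under the binomial assumption every extra hop multiplies the failure probability by $1 - q + q \cdot$ `p_comp`, which is why the success probability grows quickly with both $L$ and $q$.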



We analyzed to this point the system state process and
the per-message protection due to GossiCrypt given the path
length $L$. In general, $L$ is a random variable. If we knew its
probability distribution $P(L)$, the probability of breaking the
confidentiality of a single measurement ($T = t_0$) from a given
node ($\Delta = 1/N$) would be

$$
\mathcal{F}_{t_0, 1/N} = E_L\left[ 1 - P\{Y > 0\} \right] \tag{6}
$$

What we are interested in though, as per our specification, is the
confidentiality with respect to any $\Delta \ge 1/N$ and $T = kt_0$
for integer $k \ge 1$. Clearly, it depends on $P(L)$, which is a
complicated consequence of the relative placement of the sink
and sources, as well as the patterns by which the adversary
compromises nodes and the sink refreshes them. As a result,
we proceed without making an assumption on $P(L)$ and
describe the property of GossiCrypt in an asymptotic sense.

Claim: GossiCrypt guarantees the $\Delta T$-Confidentiality property
for $\Delta \ge 1/N$ with probability $\mathcal{P}$ (with $N$ being the system
size), and $\mathcal{P} \to 1$ when $T \gg t_0$.

Proof: As it is at least as hard to breach the confidentiality
of two or more measurements as that of a single
one, it is clear that $\mathcal{F}_{t_0,\Delta} \le \mathcal{F}_{t_0,1/N}$ for any $\Delta \ge 1/N$. The
strict inequality holds if the events of compromising two or
more measurements are independent. Furthermore, we have
that $\mathcal{F}_{T,\Delta} = (\mathcal{F}_{t_0,\Delta})^k$ for $T = kt_0$, $k > 0$. Therefore,
$\mathcal{P} = 1 - \mathcal{F}_{T,\Delta} \ge 1 - [\mathcal{F}_{t_0,1/N}]^k \to 1$ if $k \to \infty$. In other words,
as $k$ grows, the probability of safeguarding the confidentiality
of $\Delta$ measurements over a period $T$ goes to one. Put plainly,
if the data history to be captured is sufficiently long, there
is virtually no opportunity for the adversary to succeed in
breaking its confidentiality. ■
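As a small numerical illustration of the proof, the sketch below raises a single-measurement breach probability to the $k$-th power. It uses the value 0.1742 quoted earlier for $L = 5$ and $q = 0.5$ as a stand-in for the single-measurement probability; the helper name is hypothetical, and the independence of successive measurements is the assumption made in the argument itself.

```python
def protection_lower_bound(f_single, k):
    """Lower bound 1 - f_single**k on the protection probability for T = k * t0."""
    if not 0.0 <= f_single <= 1.0:
        raise ValueError("f_single must be a probability")
    if k < 1:
        raise ValueError("k must be a positive integer")
    return 1.0 - f_single ** k


if __name__ == "__main__":
    # Print the bound for histories of one to five measurement periods.
    for k in range(1, 6):
        print(k, f"{protection_lower_bound(0.1742, k):.6f}")
```

The bound increases monotonically in $k$, so longer data histories are harder to capture in full even when a single snapshot leaks with noticeable probability.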



As shown in Fig. 3, it is always preferable to have $\lambda\tau > 1$
(although $\lambda\tau < 1$ can be compensated by aggressively setting
$q$). This is not hard to achieve because, whereas the adversary
obtains keys via its physical presence, the key refreshing is
performed automatically and remotely. A conservative way to
achieve this is to estimate $\tau_{\min}$ (the lower bound of $\tau$) and to
set $\lambda > \tau_{\min}^{-1}$. Estimating $\tau$ online can be preferable. We also
note that the convergence of $\mathcal{P}$ persists even if $\lambda\tau < 1$ but, of
course, with a lower speed.

B. Energy Expenditure 

As we mentioned in Sec. I, applying PKE is an alternative
solution to thwart a parasitic adversary. We will show in this
section that a PKE-based solution, although sound in theory, is inferior
to GossiCrypt due to the much higher energy expenditure it
incurs.

For a quantitative comparison between PKE and GossiCrypt,
we make the following assumptions:

1. The network size $N \le 2^{16}$, so node identity $S_i$ needs
at most 16 bits.

2. Each message has a length of 20 bytes.

3. GossiCrypt makes use of AES-128 encryption.

4. The PKE can either be RSA-1024 or ECC-160.

5. The energy expenditure for transmission is 0.21 μJ/bit.

The transmission cost refers to MICA2 nodes, as do
the computation delays for cryptographic operations and the
related power dissipation, based on available experimental
results. Note that the fourth assumption strongly favors PKE,
with its 80-bit security compared with the AES 128-bit
security level. The energy costs are taken from [25]. Although
hardware implementations could significantly reduce energy
consumption for all primitives llT3l . (lU, ifTOl . the order of 
difference is maintained. 

Table II compares GossiCrypt with two variants of PKE in
terms of computation and communication complexity.

           GossiCrypt                  PKE-RSA        PKE-ECC
Comp.      32.4 μJ/msg                 14.1 mJ/msg    53.4 mJ/msg
Comm.      An increase of 16q bits     1024 bits      320 bits
           per message per hop         per message    per message

TABLE II

Comparison between GossiCrypt and PKEs.



Footnote on the choice of PKE: Rabin PKE, in theory, is more efficient than RSA (though the difference
can be as low as one modular multiplication for low RSA exponent operations)
[19]. However, we are not aware of sensor network software implementations
for Rabin PKE. Moreover, Rabin appears to be costlier than RSA in certain
implementations on other platforms [7].

Footnote on computation complexity: The computational complexity is measured in different units for
symmetric-key and public-key encryption in [25]. So we need to fix the
message size in order to compare them.

We have the following observations on Table II. First, the
energy expenditure in computation of GossiCrypt at a source
node is 2 to 3 orders of magnitude lower than those of
PKEs. Second, the energy expenditure in communication of
GossiCrypt for each node en route remains lower than those
of PKEs up to $10q^{-1}$ (for PKE-ECC) and $54q^{-1}$ (for PKE-RSA)
hops (note that $q \le 1$).

We assume the scale of
the WSN meets these criteria and we only compare the
computation cost below. Note that assuming 20-byte messages
actually favors PKE-ECC, whose cost would be doubled if, for
example, the message were one byte longer.
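The hop bounds follow from simple bit counting. A sketch, assuming a 160-bit (20-byte) plaintext, the per-message ciphertext sizes of Table II, and an expected GossiCrypt growth of $16q$ bits per hop; reading that growth as one 16-bit node identity added with probability $q$ is an interpretation, and the function and parameter names are illustrative.

```python
def break_even_hops(pke_ciphertext_bits, q, message_bits=160, id_bits=16):
    """Hop index at which the GossiCrypt message has grown to the PKE ciphertext size.

    A PKE ciphertext carries (pke_ciphertext_bits - message_bits) extra bits on
    every hop, while the GossiCrypt message grows by id_bits * q bits per hop
    on average (assumed growth model).
    """
    if not 0.0 < q <= 1.0:
        raise ValueError("q must be in (0, 1]")
    pke_extra_bits = pke_ciphertext_bits - message_bits
    return pke_extra_bits / (id_bits * q)


if __name__ == "__main__":
    for name, bits in (("PKE-ECC", 320), ("PKE-RSA", 1024)):
        print(name, break_even_hops(bits, q=0.5))
```

With $q = 1$ the function returns 10 and 54 hops, matching the coefficients of the two bounds; smaller $q$ stretches both bounds by $1/q$.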

The additional computation cost for GossiCrypt compared
with PKE stems from key refreshing; we denote it as $c_{\mathrm{refresh}}$.
Based on the analysis in Sec. V-A, let us assume a refresh rate
equal to the adversary compromise rate (i.e., $\lambda\tau = 1$). For
$T = kt_0$, let $\tau = T/k$ as per the definition of the parasitic
adversary, or in other words, the adversary compromises one
node per measurement period $t_0$. Then, for a (sub-)network
of $N$ nodes among which the sink picks randomly, each node
will be refreshed on average once every $N$ measurement
periods. The advantage for GossiCrypt per source node is
approximately the ratio
$\frac{N \times c_{\mathrm{GC}} + c_{\mathrm{refresh}}}{N \times c_{\mathrm{PKE}}} \approx \frac{N+1}{N} \frac{c_{\mathrm{GC}}}{c_{\mathrm{PKE}}}$ without
public-key encryption (as $c_{\mathrm{GC}} \approx c_{\mathrm{refresh}}$) or $\approx \frac{c_{\mathrm{refresh}}}{N c_{\mathrm{PKE}}}$ with
public-key encryption (as $\frac{c_{\mathrm{GC}}}{c_{\mathrm{refresh}}} \ll 1$), where $c_{\mathrm{GC}}$ and $c_{\mathrm{PKE}}$ are
the computation costs for GossiCrypt and PKEs, respectively,
given in Table II.

As the advantage of GossiCrypt over PKEs is tremendous
without public-key encryption, we only consider the key
refreshing with ECC-based public-key encryption. In this case,
the cost of refreshing is dominated by one ECC encryption,
thus $\frac{c_{\mathrm{refresh}}}{c_{\mathrm{PKE}}} \approx 1$. Therefore, the ratio $\frac{1}{N} \frac{c_{\mathrm{refresh}}}{c_{\mathrm{PKE}}}$ decreases as $N$
grows, thus making GossiCrypt increasingly advantageous.
For example, if $N = 100$, GossiCrypt can be 100 times less
costly than PKE-ECC. For PKE-RSA, $c_{\mathrm{refresh}} \approx 3c_{\mathrm{PKE}}$ and
GossiCrypt is still 33 times less costly. However, the very high
communication cost of PKE-RSA is a significant disadvantage
that makes PKE-RSA infeasible.
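Plugging the Table II figures into the cost ratio gives a feel for the magnitudes. The sketch below assumes that one key refresh costs one ECC encryption (53.4 mJ) and keeps the per-message GossiCrypt cost that the approximation above neglects, so the resulting factors come out somewhat below the rounded figures of 100 and 33. The constant and function names are illustrative.

```python
C_GC = 32.4e-6       # J per message, AES-128 based GossiCrypt (Table II)
C_PKE_RSA = 14.1e-3  # J per message, RSA-1024 (Table II)
C_PKE_ECC = 53.4e-3  # J per message, ECC-160 (Table II)


def advantage(n_nodes, c_pke, c_refresh=C_PKE_ECC, c_gc=C_GC):
    """How many times cheaper GossiCrypt is: N * c_PKE / (N * c_GC + c_refresh).

    Assumes each node pays for one refresh every n_nodes measurement periods
    and that the refresh costs one ECC encryption (default c_refresh).
    """
    return (n_nodes * c_pke) / (n_nodes * c_gc + c_refresh)


if __name__ == "__main__":
    for name, cost in (("PKE-ECC", C_PKE_ECC), ("PKE-RSA", C_PKE_RSA)):
        print(name, f"{advantage(100, cost):.1f}x at N = 100")
```

The factor grows with $N$ until the per-message cost of GossiCrypt dominates the amortized refresh cost, at which point it saturates near $c_{\mathrm{PKE}} / c_{\mathrm{GC}}$.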

The comparison above might seem unfair, as one could
argue that using PKE on a per-message basis is not necessary;
for example, PKE could be used only to "transport" a
symmetric key from each source sensor node to the sink. Then,
such end-to-end symmetric keys could be the only ones
used to encrypt data measurements, once and only at the source.
Clearly, such symmetric keys would be used for numerous
subsequent data messages, followed by a new key transport.
However, as we emphasized in Sec. I, such conventional
key refreshing does not fully thwart the parasitic adversary:
between two refreshing events, the adversary would still be
fully capable of compromising nodes and hence decrypting
their data. Therefore, to reach the security level achieved by
GossiCrypt, conventional key refreshing has to be performed
frequently for almost all nodes. Given our assumption that the
adversary compromises one node per measurement period $t_0$,
without GossiCrypt all $N$ (symmetric) keys would have to be
refreshed every $t_0$. Since $c_{\mathrm{refresh}} \ge c_{\mathrm{PKE}}$ in general, it would
be more efficient to just use PKE on a per-message basis.



VI. Experiment Results 

We perform simulations in Matlab. We only simulate the
operations of GossiCrypt without taking the MAC/PHY effects
into account. We assume a grid network where nodes appear
on a $\sqrt{N} \times \sqrt{N}$ square lattice. The movements of both the
sink and the adversary follow a 2D random walk: each of the four
possible directions is chosen with identical probability 1/4.
The intervals between two successive moves
follow exponential distributions with mean $\lambda^{-1}$ and $\tau$
for the sink and the adversary, respectively. We assume $N =
100$, $\lambda = 1$, and $\tau = 1.5$. To remove the boundary effect,
we project the lattice on a torus, i.e., moving out of
one side of the lattice leads to entering on the opposite side. We
illustrate these settings in Fig. 4.

Fig. 4. Simulation settings.
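The setup above can be sketched in a few lines of Python (the original experiments use Matlab). This is a simplified re-implementation under stated assumptions: all nodes start correct, both walkers start at random positions, the sink refreshes and the adversary compromises whichever node it arrives at, and the state is sampled once per transition after a warm-up. Function and parameter names are hypothetical.

```python
import random


def simulate_correct_nodes(side=10, lam=1.0, tau=1.5,
                           transitions=11000, warmup=1000, seed=0):
    """Empirical distribution of the number of correct nodes on a torus.

    The sink and the adversary walk independently on a side x side torus;
    their move intervals are exponential with mean 1/lam and tau.
    """
    rng = random.Random(seed)
    n_nodes = side * side
    correct = [[True] * side for _ in range(side)]  # assumption: all correct at start
    n_correct = n_nodes
    pos = {
        "sink": (rng.randrange(side), rng.randrange(side)),
        "adversary": (rng.randrange(side), rng.randrange(side)),
    }
    rate = {"sink": lam, "adversary": 1.0 / tau}
    next_move = {who: rng.expovariate(rate[who]) for who in rate}
    directions = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    histogram = [0] * (n_nodes + 1)

    for step in range(transitions):
        who = min(next_move, key=next_move.get)   # walker with the earliest move time
        dx, dy = rng.choice(directions)           # each direction with probability 1/4
        x = (pos[who][0] + dx) % side             # wrap around: torus topology
        y = (pos[who][1] + dy) % side
        pos[who] = (x, y)
        becomes_correct = (who == "sink")         # sink refreshes, adversary compromises
        if correct[x][y] != becomes_correct:
            correct[x][y] = becomes_correct
            n_correct += 1 if becomes_correct else -1
        next_move[who] += rng.expovariate(rate[who])
        if step >= warmup:                        # discard the transient phase
            histogram[n_correct] += 1

    samples = transitions - warmup
    return [count / samples for count in histogram]


if __name__ == "__main__":
    dist = simulate_correct_nodes()
    mean = sum(i * p for i, p in enumerate(dist))
    print(f"empirical mean number of correct nodes: {mean:.1f}")
```

Because a random walk visits nodes in a correlated order, the empirical histogram need not coincide exactly with the analytical stationary distribution; the sketch is meant for comparing the two rather than for reproducing the reported figures.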

Since the stochastic process described above can be proved
to be aperiodic and positive recurrent, all the states are
ergodic [6]. Therefore, we can use statistics over time to
characterize the stationary distribution. We run each simulation for
11000 transitions and truncate the first 1000 points (which are
in the transient phase), such that the results are measured in steady
state. Fig. 5 shows the comparison between four empirical
stationary distributions resulting from four simulation runs and
the analytical one obtained in Sec. V-A. It is clear that the
analytical results describe the stationary regime of the system
very well.

Based on these statistics, we can again verify the success
probability $P\{Y > 0\}$ by randomly choosing routing paths
between nodes and the adversary. For brevity, we only illustrate
the case with $L = 6$ in Fig. 6 (showing the medians and 95%
quantiles) and compare the results with the analytical ones
shown in Table I. The comparison shows that the analytical
results are a bit overoptimistic, but the differences with the
experiment results are negligible.

Finally, we verify our claim that GossiCrypt guarantees
the $\Delta T$-Confidentiality property with probability almost one
when $T = kt_0$ is sufficiently long. To this end, we randomly
pick two nodes on the grid and consider one as the source
and the other as the data collector. By applying GossiCrypt to
the shortest path between the two nodes, we can evaluate the
quantity $\mathcal{F}_{kt_0,1/N}$ for different values of $k$. As shown in Fig. 7,
this probability converges very fast to zero with an increasing
$k$, according to both simulation and analytical results. This
corroborates our claim that $\mathcal{P} = 1 - \mathcal{F}_{T,\Delta} \to 1$.

Footnote on sink movement: We note that the sink may make a virtual movement by simply changing
the target of the key refreshing protocol, but the adversary has to always
physically move to a node to launch its attack.

Fig. 5. Stationary distributions of the number of correct nodes.

Fig. 6. Success probability $P\{Y > 0\}$ as a function of the GossiCrypt
parameter $q$.

To summarize our results in the analysis of Sec. V-A and the
experiments of this section: we showed that, for any protocol-
or application-specific objective $\Delta \ge 1/N$, the confidentiality
of the sensed data can be safeguarded with probability almost
equal to one. Although this seems to require that a sufficiently
high number of measurements (or equivalently a long period
$T$) are of interest, analytic and experimental values show that
even very short sequences (e.g., $T = 5t_0$) of measurements
originating from a single source node can be protected with
probability fast approaching one. This is achieved thanks to
the GossiCrypt en-route encryption, resulting in particularly
robust operation even when approximately 40% of the nodes
are compromised by the adversary (as shown by Fig. 5).

VII. Discussion 
Fig. 7. The probability of breaking the confidentiality of $k$ measurements
from a given node, $\mathcal{F}_{kt_0,1/N}$, as a function of $k$.

As described in Sec. IV-B, the key refreshing protocol
does not provide reliable communication. Hop-by-hop
retransmissions can remedy transient packet loss, but it may
still be possible that a key refreshing message sent from
a node $S_i$ to the sink $\Theta$ is lost. In that case, $\Theta$ would
be unable to decrypt messages $S_i$ encrypts with the new
("refreshed") symmetric key. A multi-round sensor node-sink
communication protocol, to confirm at both ends that the key
refreshing was successful, would not be an option. This is so
as the sink response could single-handedly divulge the node
that performed the refreshing, and thus enable the adversary to
target the node and re-compromise it to obtain the new key.
As a result, we propose here a straightforward solution: to
add limited redundancy only for the infrequent key refreshing
messages. One option is to let $S_i$ repeat the same message a
few times; in the presence of benign faults the probability
of successful reception will be practically one with a few
repetitions. Depending on the underlying networking protocol,
if, for example, nodes form a directed acyclic graph rather
than a tree, each node could transmit key refreshing message
replicas to different neighbors and thus across different paths.
Of course, adding redundancy leads to higher overhead; for
example, instead of transmitting one key refreshing message
over $N$ nodes per $t_0$ seconds, $r$ would be transmitted, but
GossiCrypt is still advantageous as $r \ll N$.
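To make the effect of repetition concrete, suppose each copy of a key refreshing message is lost independently with probability $p_{\mathrm{loss}}$ (an assumption introduced here for illustration; correlated losses along a shared path would weaken the result). The probability that at least one of $r$ copies reaches the sink is then

$$
P_{\mathrm{recv}} = 1 - p_{\mathrm{loss}}^{\,r}, \qquad \text{e.g. } p_{\mathrm{loss}} = 0.2,\ r = 3 \ \Rightarrow\ P_{\mathrm{recv}} = 1 - 0.008 = 0.992 .
$$

Sending replicas across different paths, where the topology allows it, is one way to bring the losses closer to the independence assumed in this estimate.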

The impact of active adversaries is discussed next. After
compromising a key, they can impersonate $S_i$ and invoke a
fake key refreshing. The adversary could then establish a new
shared key with the sink. At first, the impersonating adversary
would be constrained in terms of where to invoke the fake
refreshing from, as the $S_i$-to-$\Theta$ path is essentially accumulated
in the key refreshing message. Independently of that, however,
once the actual key refreshing occurs, $S_i$ will operate with
a different key from its impostor. The unobtrusive adversary
cannot prevent $S_i$ from launching a key refreshing protocol,
and it cannot upload its own "new" key to $S_i$. As a result, even
if the adversary controls the $S_i$-to-$\Theta$ communication, it can at
most deny data collection from $S_i$. But the active adversary
would fail to obtain the data $S_i$ reports encrypted with the
actual new key, unless it physically re-compromises $S_i$.

Footnote on fake key refreshing: Public key cryptography (e.g., digital signatures generated by a source
node $S_i$) would not be advantageous: the private key of $S_i$ can be obtained
by an adversary that physically compromises $S_i$.

As follow-up work, we intend to consider specific
instantiations of WSNs, e.g., network sizes and topologies, data
extraction and key refreshing methods, and value ranges for
other system characteristics such as S, $T$, and $\Delta$, and $\tau$ and
$\lambda$. Extending our work in this way, through analytical and
experimental means, would allow us to investigate a number
of interesting questions. For example, we could postulate fine-grained
claims conditional on specific networks, revealing design
trade-offs due to the relative roles of $\Delta$ and $T$. Or, we could identify the
right "mix" of symmetric- and public-key based key refreshing
techniques, as a function of the adversary presence, to evaluate
the trade-off of effectiveness for cost.

VIII. Conclusion 

As security becomes an important requirement for WSNs,
the salient characteristics of WSNs point to the more relevant
threats and types of exploit to thwart with practical defense
mechanisms. With this consideration in mind, we identify here
a novel threat, a parasitic adversary, targeting exactly the most
valuable asset of a WSN, its measurements. The parasitic
adversary is a practical and realistic threat because of (i) its
well-aimed exploit, unauthorized access to WSN data, (ii) its
well-chosen methods, targeting the weakest system point,
the low physical sensor node protection, and (iii) its resource
constraints and "low-profile" operation.

The second and main contribution of this paper is GossiCrypt,
a scheme to ensure WSN data confidentiality.
GossiCrypt's two building blocks are a probabilistic en route
encryption of the data towards the sink and a key refreshing
mechanism, both leveraging the scale of WSNs. The former
relies on very simple key management assumptions and is
simple in operation. The latter reverses the impact of the
physical compromise of sensor nodes.

Our evaluation shows that GossiCrypt can prevent the
breach of WSN confidentiality in a wide range of settings.
Even though the adversary could obtain solitary or sparse
measurements, our analysis and simulations show that GossiCrypt
prevents the compromise of a meaningful set of measurements
over a period of time with probability going to one. The
most intriguing feature of GossiCrypt lies in its ability to
defend the WSN data confidentiality with simple and low-cost
mechanisms. We believe that such approaches that leverage
the WSN characteristics, rather than imitating ironclad
approaches from other distributed computing paradigms,
can be effective in addressing security challenges for wireless
sensor networks.